-
Notifications
You must be signed in to change notification settings - Fork 8
src: clamp idle_diff before computing proc_diff #372
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughAdds defensive clamping in event-loop metric calculations: clamp Changes
Sequence Diagram(s)sequenceDiagram
participant UV as uv_metrics_cb_
participant Stats as get_event_loop_stats_
participant State as loop vars/times
UV->>State: read loop_idle_time, loop_duration
alt loop_idle_time > loop_duration
UV->>State: loop_idle_time = loop_duration
Note right of State: clamp to prevent unsigned underflow
end
Stats->>State: read loop_diff, idle_diff
alt idle_diff > loop_diff
Stats->>State: idle_diff = loop_diff
Note right of State: clamp to avoid bogus utilization
end
Stats->>State: compute proc_diff and utilization metrics
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/nsolid/nsolid_api.cc(2 hunks)
🔇 Additional comments (1)
src/nsolid/nsolid_api.cc (1)
2165-2171: LGTM! Clamping prevents unsigned underflow.The clamping logic correctly handles the clock skew issue where
idle_diffcan be slightly larger thanloop_diffon some platforms. This prevents unsigned underflow in theproc_diffcalculation on line 2171 and eliminates the bogus utilization values reported in issue #15.
src/nsolid/nsolid_api.cc
Outdated
| // Check current_hrtime_ >= exit_time to avoid unsigned underflow when | ||
| // exit_time is slightly larger due to clock skew (observed on Windows). | ||
| // Without this check, the subtraction can produce bogus values in the | ||
| // millions of seconds range. | ||
| fprintf(stderr, "[DEBUG] exit_time=%llu, prev_call_time=%llu, current_hrtime=%llu, rolling_est_lag=%f\n", | ||
| (unsigned long long)exit_time, | ||
| (unsigned long long)stor->prev_call_time_, | ||
| (unsigned long long)stor->current_hrtime_, | ||
| envinst->rolling_est_lag_); | ||
| stor->loop_estimated_lag = | ||
| (exit_time > 0 && stor->prev_call_time_ > exit_time) ? | ||
| (exit_time > 0 && stor->prev_call_time_ > exit_time && | ||
| stor->current_hrtime_ >= exit_time) ? | ||
| (stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_; | ||
| fprintf(stderr, "[DEBUG] loop_estimated_lag=%f (condition: exit_time>0=%d, prev>exit=%d, current>=exit=%d)\n", | ||
| stor->loop_estimated_lag, | ||
| exit_time > 0, | ||
| stor->prev_call_time_ > exit_time, | ||
| stor->current_hrtime_ >= exit_time); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove debug logging before merging.
The additional guard stor->current_hrtime_ >= exit_time correctly prevents unsigned underflow, but the debug fprintf statements on lines 2206-2210 and 2215-2219 must be removed before merging to production:
- These debug logs will pollute stderr in production environments
- Performance concern: metrics collection runs frequently (every event loop iteration)
- Portability:
%llumay not be correct foruint64_ton all platforms (considerPRIu64from<cinttypes>)
Apply this diff to remove the debug logging:
- fprintf(stderr, "[DEBUG] exit_time=%llu, prev_call_time=%llu, current_hrtime=%llu, rolling_est_lag=%f\n",
- (unsigned long long)exit_time,
- (unsigned long long)stor->prev_call_time_,
- (unsigned long long)stor->current_hrtime_,
- envinst->rolling_est_lag_);
stor->loop_estimated_lag =
(exit_time > 0 && stor->prev_call_time_ > exit_time &&
stor->current_hrtime_ >= exit_time) ?
(stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_;
- fprintf(stderr, "[DEBUG] loop_estimated_lag=%f (condition: exit_time>0=%d, prev>exit=%d, current>=exit=%d)\n",
- stor->loop_estimated_lag,
- exit_time > 0,
- stor->prev_call_time_ > exit_time,
- stor->current_hrtime_ >= exit_time);
stor->loop_idle_percent = (1 - stor->loop_utilization) * 100;📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Check current_hrtime_ >= exit_time to avoid unsigned underflow when | |
| // exit_time is slightly larger due to clock skew (observed on Windows). | |
| // Without this check, the subtraction can produce bogus values in the | |
| // millions of seconds range. | |
| fprintf(stderr, "[DEBUG] exit_time=%llu, prev_call_time=%llu, current_hrtime=%llu, rolling_est_lag=%f\n", | |
| (unsigned long long)exit_time, | |
| (unsigned long long)stor->prev_call_time_, | |
| (unsigned long long)stor->current_hrtime_, | |
| envinst->rolling_est_lag_); | |
| stor->loop_estimated_lag = | |
| (exit_time > 0 && stor->prev_call_time_ > exit_time) ? | |
| (exit_time > 0 && stor->prev_call_time_ > exit_time && | |
| stor->current_hrtime_ >= exit_time) ? | |
| (stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_; | |
| fprintf(stderr, "[DEBUG] loop_estimated_lag=%f (condition: exit_time>0=%d, prev>exit=%d, current>=exit=%d)\n", | |
| stor->loop_estimated_lag, | |
| exit_time > 0, | |
| stor->prev_call_time_ > exit_time, | |
| stor->current_hrtime_ >= exit_time); | |
| // Check current_hrtime_ >= exit_time to avoid unsigned underflow when | |
| // exit_time is slightly larger due to clock skew (observed on Windows). | |
| // Without this check, the subtraction can produce bogus values in the | |
| // millions of seconds range. | |
| stor->loop_estimated_lag = | |
| (exit_time > 0 && stor->prev_call_time_ > exit_time && | |
| stor->current_hrtime_ >= exit_time) ? | |
| (stor->current_hrtime_ - exit_time) / 1e6 : envinst->rolling_est_lag_; | |
| stor->loop_idle_percent = (1 - stor->loop_utilization) * 100; |
🤖 Prompt for AI Agents
In src/nsolid/nsolid_api.cc around lines 2202 to 2219, remove the two debug
fprintf blocks (the one printing
exit_time/prev_call_time/current_hrtime/rolling_est_lag and the one printing
loop_estimated_lag and the condition flags) so they don't pollute stderr or
impact performance; keep the added guard (stor->current_hrtime_ >= exit_time)
and the loop_estimated_lag computation unchanged, and if any future logging is
needed use a proper logging macro or PRIu64 via <cinttypes> rather than raw
fprintf with %llu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not anymore in the PR code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if the debug fprintf statements are still present in the current PR branch
grep -n "fprintf(stderr, \"\[DEBUG\]" src/nsolid/nsolid_api.cc || echo "No debug fprintf statements found"
Length of output: 33
@santigimeno Confirmed! I've verified that the debug fprintf statements have been removed from the current code. Thank you for addressing this concern. The core fix with the stor->current_hrtime_ >= exit_time guard to prevent unsigned underflow looks good. 👍
🧬 Code Graph Analysis Results
-
src/nsolid/nsolid_api.cc: UpdateConfig (EnvList::UpdateConfig)
- Summary: Merges a new JSON config into current_config_, detects diffs, and if there are changes:
- increments current_config_version_
- toggles promise tracking if "promiseTracking" is present
- updates the continuous profiler if contCpuProfile/contCpuProfileInterval changed
- (re)starts OTLP, StatsD, and gRPC agents if "otlp", "statsd", or "grpc" keys are present
- updates info_.tags when "tags" changes
- enqueues the new config to on_config_string_q_ and signals process_callbacks_msg_
- Parameters:
- const nlohmann::json& config: new configuration to merge
- Returns: void (performs updates and queues callbacks as needed)
- Exceptions/Errors: uses CHECK/ASSERT patterns; on error, behavior follows existing CHECK_EQ paths (likely termination in debug builds)
- Summary: Merges a new JSON config into current_config_, detects diffs, and if there are changes:
-
src/nsolid/nsolid_api.cc: calculateKernelVersion helper and GetKernelVersion
- Summary: Linux-specific code to derive a numeric kernel version for telemetry/config purposes.
- calculateKernelVersion(): parses kernel version from /proc/version_signature on Ubuntu, or from uname() / release, handling Debian quirks and UNAME26 edge cases. Returns a version as major<<16 | minor<<8 | patch.
- GetKernelVersion(): caches the computed kernel_version (uint32_t) and returns it; Linux-specific path is guarded with #ifdef linux.
- Parameters:
- None (internal helpers use system information)
- Returns:
- uint32_t kernel_version (cached) on success; 0 if not Linux or unavailable
- Exceptions/Errors:
- Reads files and uses sscanf/strstr; on errors it returns 0 or 0-equivalent version as fallback.
- Summary: Linux-specific code to derive a numeric kernel version for telemetry/config purposes.
Note: The above entries point to code that interacts with the PR-related changes in src/nsolid/nsolid_api.cc, specifically around dynamic reconfiguration (UpdateConfig) and host/kernel environment probing (GetKernelVersion/calculateKernelVersion). If you want, I can extract the exact code blocks for each item into a compact snippet with precise line references.
5666509 to
890194b
Compare
Clamp idle time values before subtracting from loop duration to avoid unsigned underflow when clock skew causes idle time to be slightly larger than total loop time. This manifests on Windows when the loop is nearly 100% idle. Applied in two locations: - uv_metrics_cb_: prevents bogus rolling_est_lag_ values - get_event_loop_stats_: prevents bogus loop_utilization values Fixes: nodesource/nsolid-private#15
890194b to
a4e371e
Compare
Clamp idle time values before subtracting from loop duration to avoid unsigned underflow when clock skew causes idle time to be slightly larger than total loop time. This manifests on Windows when the loop is nearly 100% idle. Applied in two locations: - uv_metrics_cb_: prevents bogus rolling_est_lag_ values - get_event_loop_stats_: prevents bogus loop_utilization values Fixes: nodesource/nsolid-private#15 PR-URL: #372 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Rafael Gonzaga <[email protected]>
|
Landed in cd09ca4. |
Clamp idle time values before subtracting from loop duration to avoid unsigned underflow when clock skew causes idle time to be slightly larger than total loop time. This manifests on Windows when the loop is nearly 100% idle. Applied in two locations: - uv_metrics_cb_: prevents bogus rolling_est_lag_ values - get_event_loop_stats_: prevents bogus loop_utilization values Fixes: nodesource/nsolid-private#15 PR-URL: #372 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Rafael Gonzaga <[email protected]>
Clamp idle time values before subtracting from loop duration to avoid unsigned underflow when clock skew causes idle time to be slightly larger than total loop time. This manifests on Windows when the loop is nearly 100% idle. Applied in two locations: - uv_metrics_cb_: prevents bogus rolling_est_lag_ values - get_event_loop_stats_: prevents bogus loop_utilization values Fixes: nodesource/nsolid-private#15 PR-URL: #372 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Rafael Gonzaga <[email protected]>
Clamp idle time values before subtracting from loop duration to avoid unsigned underflow when clock skew causes idle time to be slightly larger than total loop time. This manifests on Windows when the loop is nearly 100% idle. Applied in two locations: - uv_metrics_cb_: prevents bogus rolling_est_lag_ values - get_event_loop_stats_: prevents bogus loop_utilization values Fixes: nodesource/nsolid-private#15 PR-URL: #372 Reviewed-By: Juan José Arboleda <[email protected]> Reviewed-By: Rafael Gonzaga <[email protected]>
Avoid unsigned underflow when loop_diff is slightly larger than idle_diff due to clock skew on idle loops.
Fixes: https://github.com/nodesource/nsolid-private/issues/15
Summary by CodeRabbit